ABSTRACT

The test performance of students who took the
Scholastic Aptitude Test (SAT) only once as juniors was contrasted
with that of students who took the test as juniors and again as seniors.
Estimates of expected test performance on a common initial
administration in the junior year were derived from separate equating
sections and background variables. Residuals of observed minus
expected test scores revealed statistically significant differences
between students who took a single administration of the SAT as
juniors and those who took the same initial administration but also
repeated the test as seniors. The initial observed scores of students
later repeating the test were consistently lower than their expected
scores for both the verbal and mathematical sections. The results
indicate that self-selection occurs when students decide to repeat a
test. Score changes among these students reflect negative errors of
measurement on the initial test administration. (Author/DWH)
ABSTRACT



Student self-selection in deciding to repeat a test was examined by contrasting the test
performance of students taking the College Board's Scholastic Aptitude Test (SAT) as
juniors and again as seniors with the test performance of students taking the SAT only
once as juniors. Estimates of expected test performance on a common initial
administration in the junior year were derived from separate equating sections and
background variables. Residuals of observed minus expected test scores revealed
statistically significant differences between students who took a single administration
of the SAT as juniors and students who took the same initial administration but also
repeated the test as seniors; the initial observed scores of students later repeating the
test were consistently lower than their expected scores for both the verbal and
mathematical sections. These results indicate that self-selection occurs when students
decide to repeat a test and that score changes among these students reflect negative
errors of measurement on the initial test administration as well as other factors.



STUDENT SELF-SELECTION AND TEST REPETITION 



The extent of score change from one administration of a test to another administration
of the same test is often taken as evidence of the effectiveness of a particular
intervention or of growth among certain individuals. Problems inherent in the use and
interpretation of simple differences in assessing program impact or individual differences
have received considerable attention. Cronbach and Furby (1970) and Linn and Slinde
(1977) provide excellent critical discussions of difference scores and alternative
approaches to measuring change. A further, special case in which test-retest score
differences may misrepresent actual change arises when test candidates decide for
themselves whether or not they should repeat a test. Under such circumstances it is to be
expected that errors of measurement on the initial test administration would influence
candidates' decisions regarding retesting.

Each year hundreds of thousands of applicants to schools and colleges elect to repeat
an admissions test which they had taken earlier. High school students who have taken the
Scholastic Aptitude Test (SAT) as juniors, for example, may decide to take the test again
as seniors. Student self-selection then becomes a possible component of score change.
If students decide to repeat a test because they perceive their initial scores as
underestimates of their true abilities, the usual assumptions about the distribution of
errors of measurement on the initial test administration may not hold for this group.
There would be a nonzero and presumably negative mean for the errors of measurement,
leading to observed scores lower than true scores. Conversely, students electing not to
repeat a test would be those whose observed scores included a nonzero and positive mean
for errors of measurement on the test.
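
The argument above can be illustrated with a small simulation. This is only a sketch under
stated assumptions, not the procedure used in this study: the decision rule, the 30-point
error standard deviation, and the amount of noise in students' perceptions are all
hypothetical choices made for illustration.

```python
import random
import statistics


def simulate(n=50_000, sem=30.0, seed=0):
    """Return mean initial measurement error for repeaters and for single-takers.

    Hypothetical decision rule: a student retakes the test when a noisy
    perception of the underestimate (the negative of the error) is positive.
    """
    rng = random.Random(seed)
    repeat_errors, single_errors = [], []
    for _ in range(n):
        error = rng.gauss(0.0, sem)  # error of measurement on the initial test
        # Students sense the sign of the error only imperfectly (assumed noise).
        perceived_shortfall = -error + rng.gauss(0.0, 3.0 * sem)
        if perceived_shortfall > 0:
            repeat_errors.append(error)
        else:
            single_errors.append(error)
    return statistics.mean(repeat_errors), statistics.mean(single_errors)


repeat_mean, single_mean = simulate()
print(f"mean error, repeaters:     {repeat_mean:6.2f}")
print(f"mean error, single-takers: {single_mean:6.2f}")
```

Although errors are drawn with a mean of zero for the population as a whole, the group
that chooses to repeat ends up with a negative mean error and the group that does not
ends up with a positive one, which is the pattern hypothesized above.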

This study contrasts the test performance of students taking the SAT as juniors and
again as seniors with the test performance of students taking the SAT only once as juniors.
Estimates of expected test performance on a common initial administration in the junior
year were derived from separate equating sections and background variables.
Administrations of the SAT regularly include a variable experimental section devoted to
equating scores or pretesting items; scores on this experimental section do not enter into
the reported verbal or mathematical scores. Thus, separate and independent equating
sections provide a basis for determining whether errors of measurement in scores on
reporting sections influence student decisions to retake a test.


METHOD 



Samples of two groups of students were drawn from SAT history files: students who had
taken the SAT only once and for the first time in their junior year and students who had
taken the same initial test administration in their junior year and then repeated the test
in their senior year. The administration of the SAT from May 1979 was the initial test
common to the two groups as juniors, and the repeaters had also taken the SAT in November
1979 as seniors. Four of the 10 variable experimental sections randomly distributed in
the common initial administration from May 1979 were verbal or mathematical equating
sections, and only students whose records included these equating sections became part
of the samples. Also, students in the repeater group were those who had first taken the
SAT in May 1979 as juniors and again in November 1979 as seniors without any intervening
administrations of the test.

Under the assumption that a student's decision to retake a test is independent of the
error of measurement in his or her reporting sections, estimates of expected test
performance based on equating sections and background variables for students with a single
test administration should also fit the test performance of students with a subsequent,
repeat test administration. The samples of students with SAT results as juniors only and
students with SAT results as both juniors and seniors were split according to whether the
equating section on their initial test administration had been a verbal or a mathematical
section. Estimates of expected verbal scores from reporting sections were based on a
least-squares multiple regression of observed verbal scores on verbal equating sections
and background variables for students with a single test administration. The verbal
equating score was expressed as a standard score, based on the particular section's mean
and standard deviation, since raw or formula scores would differ from one verbal equating
section to another. Background variables were taken from the Student Descriptive
Questionnaire (SDQ) completed by students when registering for the SAT given in May 1979.
The variables included: high school rank; years of English study; latest English grade;
years of mathematics study; latest mathematics grade; educational degree aspirations;
father's level of education; mother's level of education; and public/nonpublic high
school. The same procedure was followed for observed mathematical scores with students
who had taken a mathematical equating section.
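
The two steps described above, standardizing equating scores within a section and then
regressing observed scores on the standardized score and background variables, can be
sketched as follows. This is a minimal illustration, not the study's actual software: the
function names, the toy data, and the use of a single background variable (high school
rank) are assumptions made for the example.

```python
import statistics


def standardize(raw_scores):
    """Convert raw equating scores to standard scores within one section."""
    mean = statistics.mean(raw_scores)
    sd = statistics.pstdev(raw_scores)
    return [(x - mean) / sd for x in raw_scores]


def fit_least_squares(rows, y):
    """Least-squares coefficients [constant, b1, b2, ...] via the normal equations."""
    X = [[1.0] + list(r) for r in rows]  # prepend the constant term
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coefs, row):
    """Expected score for one student from fitted coefficients."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


# Hypothetical raw scores on two equating sections with different scales.
section_a = [10, 14, 18, 22, 26]
section_b = [5, 8, 11, 14, 17]
z = standardize(section_a) + standardize(section_b)  # now on a common scale
rank = [1, 3, 2, 2, 1, 2, 1, 3, 1, 2]                # hypothetical rank codes
observed = [330, 370, 450, 490, 560, 320, 400, 420, 510, 540]

rows = list(zip(z, rank))
coefs = fit_least_squares(rows, observed)
residuals = [obs - predict(coefs, row) for obs, row in zip(observed, rows)]
print([round(c, 2) for c in coefs])
print([round(r, 1) for r in residuals])
```

Standardizing within section is what allows students who took different equating sections
to enter the same regression; the residuals are the observed-minus-expected differences
analyzed later.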

The coefficients for each term in the above regressions, one set of coefficients for
expected verbal scores and another for expected mathematical scores, were established and
validated with students who had taken a single administration of the SAT in May 1979 as
juniors. Roughly one-third of such students with a verbal equating section served as the
sample for establishing the regression coefficients, and the other two-thirds of such
students with a verbal equating section served as cross-validation samples. Because the
scores of students with complete SDQ responses differ from the scores of students with
incomplete SDQ responses, a maximum likelihood algorithm (Dempster, Laird, and Rubin,
1977) was used in establishing regression coefficients with incomplete data for background
variables. Students who had taken a single administration of the SAT in May 1979 as juniors
and had a mathematical equating section were also split into thirds for establishing and
validating another set of regression coefficients for expected mathematical scores. The
distribution of residuals for observed scores minus expected scores should be equivalent
in the regression and cross-validation samples of students who had taken a single test
administration.

Estimates of expected scores on the same initial test administration, the SAT given
in May 1979, for students later repeating the test were based on these sets of regression
coefficients. The group of students with a repeat test administration was split according
to equating section, verbal or mathematical, and then divided again into thirds in order
to check on the distribution of residuals within the group. Finally, the mean residuals
between observed and expected scores on their initial test administration were compared
for students with a single administration and students with a repeat test administration.
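
The overall design (fit on one third of the single-administration group, check the other
two thirds, then apply the same coefficients to repeaters) can be sketched with simulated
data. Everything here is an illustrative assumption: the score scale, the error variances,
the self-selection rule, and the use of a single predictor (the equating score) in place of
the full set of background variables.

```python
import random
import statistics


def simple_regression(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


def simulate_groups(n=30_000, seed=1):
    """Simulate (equating, reported) pairs for single-takers and repeaters."""
    rng = random.Random(seed)
    singles, repeaters = [], []
    for _ in range(n):
        ability = rng.gauss(0.0, 1.0)
        error = rng.gauss(0.0, 30.0)                # reporting-section error
        reported = 450.0 + 100.0 * ability + error  # hypothetical score scale
        equating = ability + rng.gauss(0.0, 0.35)   # independent equating score
        # Assumed rule: retake when a noisy sense of underestimation is positive.
        retakes = (-error + rng.gauss(0.0, 90.0)) > 0
        (repeaters if retakes else singles).append((equating, reported))
    return singles, repeaters


def mean_residual(sample, intercept, slope):
    """Mean of observed minus expected scores for a sample."""
    return statistics.mean([y - (intercept + slope * x) for x, y in sample])


singles, repeaters = simulate_groups()
third = len(singles) // 3
regression_sample = singles[:third]
cross_validation = [singles[third:2 * third], singles[2 * third:]]

intercept, slope = simple_regression([x for x, _ in regression_sample],
                                     [y for _, y in regression_sample])
cv_means = [mean_residual(s, intercept, slope) for s in cross_validation]
repeat_mean = mean_residual(repeaters, intercept, slope)
print("cross-validation mean residuals:", [round(m, 2) for m in cv_means])
print("repeater mean residual:", round(repeat_mean, 2))
```

In this sketch the cross-validation residuals stay near zero while the repeaters' mean
residual is clearly negative. Because the coefficients are fitted on single-takers, any
positive mean error in that group is absorbed by the constant term, so the repeaters'
residual reflects the errors in both groups.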



RESULTS AND DISCUSSION 


A total of 253,354 test candidates took the SAT in May 1979. Most of these examinees
(88 percent) were juniors in high school, and roughly one-third (32 percent) were juniors
who also took the SAT in November 1979 as seniors. Approximately 32,000 examinees were
juniors who took the SAT for the first and only time in May 1979 and also had a verbal or
a mathematical equating section. A comparable number of examinees with a verbal or a
mathematical equating section were juniors also taking the SAT for the first time but who
later repeated the test in November 1979 as seniors. Table 1 shows the means and standard
deviations of the test performance on reporting sections and equating sections for these
groups. These descriptive statistics and all other results presented here refer to the
initial SAT in May 1979 taken both by students with a single test administration as juniors
and by students with the same test administration as juniors as well as a later repeat test
administration as seniors. Students taking the SAT only once as juniors had slightly
higher and more dispersed scores on both reporting and equating sections than did repeaters.
There were also some slight differences in the descriptive profiles of the two groups:
somewhat higher percentages of those students who subsequently repeated the test came from
college preparatory programs, had taken three or more years of mathematics, and planned to
attain at least a bachelor's degree (see Appendix A).
TABLE 1. Means and Standard Deviations of Test Performance

                                                      SAT-Verbal       SAT-Mathematical
Group                                          N      Mean      sd      Mean      sd

Total examinees (May 1979)                 253,354     432     107       478     113
Junior examinees (May 1979)                223,394     439     105       486     111
Junior repeaters (May-November 1979)        81,959     437      97       483     104

Juniors with single test administration     31,912  439.74  113.28    484.20  118.70
  Verbal equating Section A                  8,010   16.71    8.39
  Verbal equating Section B                  8,112   15.70    8.37
  Mathematical equating Section C            7,877                     10.22    6.28
  Mathematical equating Section D            7,906                      9.82    5.50

Juniors with repeat test administration     31,971  435.13   97.87    479.13  104.52
  Verbal equating Section A                  8,158   16.65    7.46
  Verbal equating Section B                  8,017   15.36    7.60
  Mathematical equating Section C            8,017                     10.19    5.56
  Mathematical equating Section D            7,777                      9.66    5.03




Correlations of equating sections and reporting sections appear in Table 2. The high
correlation of the verbal equating sections, Sections A and B, with observed verbal scores
and of mathematical equating sections, Sections C and D, with observed mathematical scores
suggests that equating sections can provide good estimates of expected scores. Indeed,
the multiple correlations resulting from a regression of observed scores on equating
scores and background variables, R = .89 for SAT-Verbal and R = .88 for SAT-Mathematical
(see Appendix B), barely surpass the respective simple correlations among students with
a single test administration. The lower pattern of intercorrelations found in Table 2
among students with a repeat test administration compared to students with a single test
administration is consistent with the somewhat lower test reliabilities for the former
group (i.e., alpha reliability estimates of .91 and .93 for verbal scores and .90 and .92
for mathematical scores for the two respective groups). The standard error of measurement
for verbal scores was 30 points on the 200-800 SAT scale for both groups and for
mathematical scores 33 points for both groups.
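
As a rough check on these figures, the standard error of measurement can be computed from
the classical formula SEM = sd * sqrt(1 - reliability). The sketch below assumes that
formula applies and pairs the standard deviations from Table 1 with the alpha estimates
quoted above (taking the lower reliabilities to belong to the repeat group, as the text
indicates); the variable names are illustrative.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Classical test theory: SEM = sd * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


# (standard deviation from Table 1, alpha reliability from the text)
groups = {
    ("repeat", "verbal"): (97.87, 0.91),
    ("single", "verbal"): (113.28, 0.93),
    ("repeat", "mathematical"): (104.52, 0.90),
    ("single", "mathematical"): (118.70, 0.92),
}

sems = {key: standard_error_of_measurement(sd, rel)
        for key, (sd, rel) in groups.items()}
for (group, section), sem in sems.items():
    print(f"{group:6s} {section:12s} SEM = {sem:.1f}")
```

The computed values fall within about a point of the 30 and 33 points reported, and they
show why the two groups can share nearly the same standard error of measurement despite
different reliabilities: the group with the lower reliability also has the smaller
standard deviation.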



TABLE 2. Correlations of Equating Sections and Reporting Sections

                               Reporting                  Equating Section
Group                          Section       SAT-M       A      B      C      D

Single test administration     SAT-V          .73       .88    .88    .68    .67
                               SAT-M                    .68    .68    .87    .86

Repeat test administration     SAT-V          .64       .85    .85    .60    .58
                               SAT-M                    .61    .61    .84    .83







TABLE 3. Means and Standard Deviations of Residuals from Predicted Performance

                                      SAT-Verbal                SAT-Mathematical
Group                              N      Mean      sd         N      Mean      sd

Single test administration
  Regression sample              4,497    0.18    51.65      4,374    1.29    56.63
  Cross-validation sample        4,473    1.25    50.69      4,481   -0.58    55.81
  Cross-validation sample        4,385   -0.86    50.49      4,332    1.07    54.77
  Total                         13,355    0.20    50.96     13,187    0.58    55.75

Repeat test administration
  Comparison sample              4,143   -5.62    49.25      4,129  -10.01    54.67
  Comparison sample              4,186   -4.88    49.64      3,980   -8.54    55.69
  Comparison sample              4,109   -4.02    49.63      4,126   -8.99    53.31
  Total                         12,438   -4.84    49.51     12,235   -9.19    54.55



Regression estimates of expected scores were based on the relationship of observed
scores to equating scores and background variables among students who had taken the SAT
only once as juniors. The coefficients for independent variables and the constant term
established for calculating these regression estimates are given in Appendix B. Table 3
presents a summary of the residuals reflecting the difference between observed scores and
expected scores. Because regression coefficients were based on incomplete data and
residuals calculated only for students with complete data, there is a nonzero mean residual
in the regression samples. Within the group of students who had taken the SAT only once as
juniors in May 1979 there was no significant difference in the mean residual for the
regression sample and the cross-validation samples on either verbal scores,
F(2, 13352) = 1.89, p > .15, or mathematical scores, F(2, 13184) = 1.48, p > .20. Within
the group of students who had taken the SAT for the first time in May 1979 as juniors and
again in November 1979 as seniors there was no significant difference in the mean residual
across three independent comparison samples on either verbal scores, F(2, 12435) = 1.08,
p > .30, or mathematical scores, F(2, 12232) = .782, p > .45. There were, however,
significant differences in mean residuals between groups for both verbal scores,
t(25791) = 8.05, p < .001, and mathematical scores, t(25420) = 14.11, p < .001. The
observed scores of students later repeating the test were lower than the scores expected
for their initial test administration based on their performance on an equating section
and their background characteristics.
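
The between-group t statistics can be approximately recomputed from the group totals in
Table 3. The sketch below assumes an ordinary pooled-variance two-sample t test, which the
text does not state explicitly; the function name is illustrative.

```python
import math


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic with pooled variance, from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / se, df


# Totals from Table 3: single test administration vs. repeat test administration.
t_verbal, df_verbal = pooled_t(13355, 0.20, 50.96, 12438, -4.84, 49.51)
t_math, df_math = pooled_t(13187, 0.58, 55.75, 12235, -9.19, 54.55)

print(f"verbal:       t({df_verbal}) = {t_verbal:.2f}")
print(f"mathematical: t({df_math}) = {t_math:.2f}")
```

The degrees of freedom match those reported (25791 and 25420), and the statistics come out
close to the reported 8.05 and 14.11; small discrepancies would be expected from rounding
in the tabled means and standard deviations.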

These results suggest that there is student self-selection in test repetition.
Apparently, students electing to repeat an admissions test do so in part because they
perceive their initial test scores on reporting sections as underestimates of their true
abilities. Estimates of expected scores derived from equating sections and background
variables tend to confirm these student perceptions. Such self-selection in test
repetition would lead to a nonzero, negative sum of errors of measurement on repeaters'
initial test scores which would, in turn, distort the magnitude of score changes and
preclude the application of existing models for measuring change (e.g., Lord, 1963).
These findings would also seem to increase the likelihood that the student self-selection
posited in other contexts (e.g., Messick, 1980) is an important factor in score change.

The amount of score change on the SAT attributable to errors of measurement remains
unclear. Differences in the mean residuals reported here, five points for verbal scores
and 10 points for mathematical scores, reflect both positive errors among students with
a single test administration and negative errors among students with a repeat test
administration, and so may represent an overestimate. Yet some students undoubtedly take
the SAT only once or retake the test regardless of their initial scores. Such prejudgments
would lessen the effects of measurement error on score change. It does seem clear,
however, that simple score gains or losses from one administration of an admissions test
to another misrepresent change by failing to take student self-selection and other factors
into account.
APPENDIX A: Descriptive Profiles of Student Groups

                                          Students with               Students with
                                    Single Test Administration  Repeat Test Administration
                                           (N=31,912)                  (N=31,971)
                                      Frequency    Percent        Frequency    Percent

Secondary School:
  Public                                23,113      74.50           22,642      70.82
  Nonpublic                              5,213      16.34            6,348      19.86

High School Program:
  Academic (college preparatory)        22,814      71.49           25,659      80.26
  General                                3,907      12.24            2,380       7.44
  Career (business, technical)           1,843       5.78              657       2.05
  Other                                    109       0.34               64       0.20

High School Class Size:
  Less than 100 students                 2,310       7.24            2,501       7.82
  100-249 students                       6,747      21.14            6,456      20.19
  250-499 students                       8,848      27.73            9,863      30.85
  500-749 students                       5,295      16.59            5,005      15.65
  More than 750 students                 5,372      16.83            4,809      15.04

High School Rank:
  Highest tenth                          6,425      20.13            6,539      20.45
  Second tenth                           6,081      19.06            6,601      20.65
  Second fifth                           7,082      22.19            7,368      23.05
  Middle fifth                           6,995      21.92            6,060      18.95
  Fourth fifth                             781       2.45              544       1.70
  Lowest fifth                             128       0.40               80       0.25

Years of English:
  None                                      51       0.16               20       0.06
  One year                                 208       0.65              152       0.48
  Two years                                268       0.84              158       0.49
  Three years                            1,813       5.68              956       2.99
  Four years                            23,795      74.56           24,575      76.87
  More than four years                   2,743       8.60            2,962       9.26

Years of Mathematics:
  None                                      92       0.29               34       0.11
  One year                                 514       1.61              145       0.45
  Two years                              3,022       9.47            1,260       3.94
  Three years                            8,002      25.08            6,172      19.30
  Four years                            14,326      44.89           17,869      55.89
  More than four years                   2,894       9.07            3,353      10.49

Most Recent English Grade:
  Excellent (90-100, A)                  9,828      30.80            9,931      31.06
  Good (80-89, B)                       13,420      42.05           14,435      45.15
  Fair (70-79, C)                        4,874      15.27            4,021      12.57
  Passing (60-69, D)                       561       1.76              302       0.94
  Failing (below 60, F)                     61       0.19               26       0.08

(continued)



APPENDIX A: Descriptive Profiles of Student Groups (continued)

                                          Students with               Students with
                                    Single Test Administration  Repeat Test Administration
                                           (N=31,912)                  (N=31,971)
                                      Frequency    Percent        Frequency    Percent

Most Recent Mathematics Grade:
  Excellent (90-100, A)                  8,152      25.55            8,550      26.74
  Good (80-89, B)                       11,021      34.54           12,019      37.59
  Fair (70-79, C)                        7,508      23.53            6,624      20.72
  Passing (60-69, D)                     1,777       5.57            1,308       4.09
  Failing (below 60, F)                    235       0.74              141       0.44

Part-time Employment:
  None                                  12,405      38.87           12,260      38.35
  Less than 6 hours per week             2,674       8.38            2,926       9.15
  6-10 hours per week                    2,976       9.33            3,434      10.74
  11-15 hours per week                   3,398      10.65            3,707      11.59
  16-20 hours per week                   3,943      12.36            3,772      11.80
  21-25 hours per week                   2,060       6.46            1,707       5.34
  26-30 hours per week                     830       2.60              614       1.92
  More than 30 hours per week              395       1.24              251       0.79

Educational Aspirations:
  Two-year specialized training
    program                              1,320       4.14              479       1.50
  Two-year associate's degree              788       2.47              361       1.13
  Bachelor's degree                      8,584      26.90            9,326      29.17
  Master's degree                        6,243      19.56            7,264      22.72
  Professional degree                    4,395      13.77            5,211      16.30

United States Citizenship:
  Yes                                   28,325      88.76           28,392      88.81
  No                                       629       1.97              552       1.73

Armed Forces Veteran:
  Yes                                      153       0.48              154       0.48
  No                                    28,637      89.74           28,597      89.45

Ethnic Group/National Origin:
  American Indian, Alaskan native           86       0.27               69       0.22
  Black, Afro-American                   1,052       3.30            1,020       3.19
  Mexican-American, Chicano                142       0.44               78       0.24
  Oriental, Asian-American                 370       1.16              505       1.58
  Puerto Rican                             148       0.46              160       0.50
  White, Caucasian                      26,029      81.56           25,994      81.30

English as First Language:
  Yes                                   27,786      87.07           27,829      87.04
  No                                       737       2.31              67?       2.12
APPENDIX A: Descriptive Profiles of Student Groups (continued)

                                          Students with               Students with
                                    Single Test Administration  Repeat Test Administration
                                           (N=31,912)                  (N=31,971)
                                      Frequency    Percent        Frequency    Percent

Father's (Male Guardian's)
Level of Education:
  Grade school                             733       2.30              707       2.21
  Some high school                       2,124       6.66            1,832       5.73
  High school diploma                    6,589      20.65            5,635      17.63
  Business or trade school               1,798       5.63            1,798       5.62
  Some college                           4,821      15.11            4,513      14.12
  Bachelor's degree                      4,819      15.10            5,538      17.32
  Some graduate or professional
    school                               1,397       4.38            1,643       5.14
  Graduate or professional degree        6,023      18.87            6,614      20.69

Mother's (Female Guardian's)
Level of Education:
  Grade school                             507       1.59              497       1.55
  Some high school                       1,888       5.92            1,556       4.87
  High school diploma                   10,468      32.80            9,987      31.24
  Business or trade school               2,272       7.12            2,442       7.64
  Some college                           5,347      16.76            5,195      16.25
  Bachelor's degree                      3,673      11.51            4,103      12.83
  Some graduate or professional
    school                               1,556       4.88            1,791       5.60
  Graduate or professional degree        2,582       8.09            2,681       8.39

Parents' Annual Income:
  Below $3,000                             184       0.58              146       0.46
  $3,000-$5,999                            443       1.39              416       1.30
  $6,000-$8,999                            524       1.64              392       1.23
  $9,000-$11,999                           551       1.73              506       1.58
  $12,000-$14,999                          856       2.68              627       1.96
  $15,000-$17,999                          815       2.55              680       2.13
  $18,000-$20,999                        1,056       3.31              881       2.76
  $21,000-$23,999                          895       2.80              831       2.60
  $24,000-$26,999                        1,326       4.16            1,143       3.58
  $27,000-$29,999                        1,082       3.39              985       3.08
  $30,000-$34,999                        1,696       5.31            1,582       4.95
  $35,000-$39,999                        2,160       6.77            1,884       5.89
  $40,000-$44,999                        1,467       4.60            1,434       4.49
  $45,000-$49,999                        1,793       5.62            1,763       5.51
  $50,000 and over                       1,125       3.53            1,127       3.53
APPENDIX B: Regression Coefficients for Estim...

                                        SAT-V (N=5,602)            SAT-M (N=5,436)
                                    Regression Coefficients    Regression Coefficients
Independent Variables                  β        B      se         β        B      se

Equating section score               0.789    85.16   0.81      0.735    83.20   0.98
High school rank                    -0.075    -7.04   0.82     -0.082    -8.08   0.92
Years of English study               0.023     4.93   1.35     -0.006    -1.37   1.52
Latest English grade                -0.053    -3.60   0.53     -0.020    -1.45   0.58
Years of mathematics study           0.032     4.13   0.87      0.069     9.31   1.02
Latest mathematics grade            -0.016    -0.90   0.44     -0.058    -3.40   0.50
Educational aspirations              0.015     1.21   0.50      0.032     2.71   0.56
Father's level of education          0.029     1.60   0.39      0.040     2.25   0.44
Mother's level of education          0.023     1.42   0.44      0.021     1.35   0.50
Public/nonpublic high school         0.003     1.00   1.80     -0.011    -3.27   1.99

Constant                                     418.57                     472.60
Multiple correlation                          0.891                      0.876
Standard error of estimate                   51.538                      56.73



